Add model evaluation release binding gates by Peter7896 · Pull Request #1182 · UnitOneAI/SecuritySkills

Peter7896 · 2026-06-05T21:41:33Z

Summary

Add evaluation release-binding coverage to model-supply-chain, so that model artifacts are tied to immutable evaluation dataset revisions and checksums, thresholds, run IDs, evaluator identity, environment, and release decisions.
Distinguish model cards from release-quality evaluation evidence.
Expand backdoor detection with canary, trigger, and targeted-slice regression evidence bound to artifact ID, dataset version, run ID, and timestamp.
Add Not Applicable / Not Evaluable handling for private evaluation data and for model classes where trigger tests do not apply.
Update severity criteria, report template, maturity summary, common pitfalls, tags, and changelog for version 1.0.1.
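
To make the binding idea concrete, here is a minimal Python sketch of how a release gate might tie an artifact to its evaluation evidence. It is not taken from the skill itself: `EvaluationBinding`, `Decision`, `release_decision`, and every field name are hypothetical, and the rule "block on any checksum mismatch or canary regression" is an assumption made for illustration.

```python
import hashlib
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVED = "approved"
    BLOCKED = "blocked"
    NOT_EVALUABLE = "not_evaluable"


@dataclass(frozen=True)
class EvaluationBinding:
    """Hypothetical record binding one model artifact to one evaluation run."""
    artifact_sha256: str      # checksum of the model artifact under review
    dataset_revision: str     # immutable revision label of the evaluation set
    dataset_sha256: str       # checksum of the evaluation set at that revision
    run_id: str
    evaluator: str
    environment: str
    threshold: float          # minimum acceptable score
    score: Optional[float] = None   # None when the evaluation could not be run
    canary_regressions: int = 0     # failed canary/trigger/targeted-slice checks


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def release_decision(binding: EvaluationBinding,
                     artifact: bytes,
                     dataset: bytes) -> Decision:
    # The evidence only counts if it refers to exactly these bytes.
    if sha256_hex(artifact) != binding.artifact_sha256:
        return Decision.BLOCKED
    if sha256_hex(dataset) != binding.dataset_sha256:
        return Decision.BLOCKED
    # A missing score is reported explicitly rather than treated as a pass.
    if binding.score is None:
        return Decision.NOT_EVALUABLE
    if binding.canary_regressions > 0:
        return Decision.BLOCKED
    if binding.score >= binding.threshold:
        return Decision.APPROVED
    return Decision.BLOCKED
```

The frozen dataclass keeps a recorded binding from being mutated after the run, and the separate `NOT_EVALUABLE` outcome mirrors the Not Evaluable handling described above instead of folding it into pass or fail.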

Scope

This addresses #1171. I also posted an attempt comment before implementation: #1171 (comment)

Closes #1171

/claim #1171

Validation

Ran git diff --check; it reported only the existing Windows LF-to-CRLF warning.
Verified that the markdown code fence count is even (16).
Verified the issue-specific markers for evaluation integrity, evaluation release binding, immutable dataset evidence, release-result binding, canary/trigger regression, artifact-to-evaluation binding, Not Evaluable handling, and version 1.0.1.
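
The PR does not say how the fence count was obtained. One way to reproduce that kind of check is sketched below in Python; `count_fences` and `fences_balanced` are hypothetical helper names, and the check is deliberately simple: it counts lines that begin with three backticks or three tildes and does not handle longer or nested fences.

```python
FENCE_MARKERS = ("`" * 3, "~" * 3)  # backtick and tilde fence openers


def count_fences(markdown_text: str) -> int:
    """Count lines that start (after indentation) with a fence marker."""
    return sum(
        1
        for line in markdown_text.splitlines()
        if line.lstrip().startswith(FENCE_MARKERS)
    )


def fences_balanced(markdown_text: str) -> bool:
    """An even count suggests every opened fence was also closed."""
    return count_fences(markdown_text) % 2 == 0
```

An even count is a necessary condition for well-formed fences, not a sufficient one, so this works as a quick sanity check rather than a full markdown lint.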

Add model evaluation release binding gates

45e0b75

Peter7896 mentioned this pull request Jun 5, 2026

[REVIEW] model-supply-chain: add evaluation-set and backdoor-regression evidence gates #1171

Peter7896 closed this Jun 6, 2026

Peter7896 deleted the codex/model-supply-eval-gates branch June 6, 2026 04:12

Peter7896 wants to merge 1 commit into UnitOneAI:main from Peter7896:codex/model-supply-eval-gates